- Short: MuLib based math speedup patch for 040/060
- Author: thor@math.tu-berlin.de (Thomas Richter)
- Uploader: thor@math.tu-berlin.de (Thomas Richter)
- Type: util/boot
- Requires: util/libs/MMULib.lha, Kickstart V37, MuLib 68040/68060 lib
-
-
- MuRedox is a MuLib based "on the fly" speedup patch for 68040
- and 68060 based Amiga boards. The 68040 and 68060 do not implement
- all instructions of the MC 68K family. The unimplemented instructions -
- mainly FPU instructions - generate an exception and need to be
- emulated by the 68040.library or 68060.library, respectively. This is the
- job of the so-called "FPSP routines" (floating point support package) within the
- CPU libraries. MuRedox detects these instructions as soon as they
- generate the emulator exceptions, runs a "just-in-time" compiler
- that generates a "stub replacement routine" for this specific
- instruction and patches the replacement routine into the running
- program. Hence, MuRedox avoids the overhead of the emulator
- trap on the next use of the same instruction sequence.
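
The trap-and-patch cycle described above can be pictured with a toy model. The following Python sketch is purely illustrative and is not MuRedox's code: the names ToyMachine and UNIMPLEMENTED and the tuple-based "program" are invented here, and Python's math functions merely stand in for the FPSP routines.

```python
import math

# Hypothetical stand-ins for the FPSP routines; not the real fpsp.resource.
UNIMPLEMENTED = {"fsin": math.sin, "fcos": math.cos}


class ToyMachine:
    """Toy model of 'patch on first trap' (an assumption-laden sketch)."""

    def __init__(self, program):
        self.program = list(program)  # list of (opcode, operand) tuples
        self.stubs = []               # generated replacement routines
        self.traps = 0                # how often the slow emulator path ran

    def emulator_trap(self, pc):
        """Slow path: emulate once, then patch a stub call into the program."""
        self.traps += 1
        opcode, operand = self.program[pc]
        routine = UNIMPLEMENTED[opcode]
        self.stubs.append(routine)
        # Overwrite the trapping instruction with a call to the new stub.
        self.program[pc] = ("call_stub", len(self.stubs) - 1, operand)
        return routine(operand)

    def step(self, pc):
        instr = self.program[pc]
        if instr[0] == "call_stub":
            _, stub_index, operand = instr
            return self.stubs[stub_index](operand)  # fast path, no trap
        if instr[0] in UNIMPLEMENTED:
            return self.emulator_trap(pc)
        raise ValueError("unknown instruction: %r" % (instr,))


machine = ToyMachine([("fsin", 0.0)])
machine.step(0)  # first execution takes the trap and patches the program
machine.step(0)  # second execution runs the stub directly
```

In this model only the first execution of an instruction pays for the trap; every later execution of the same program location goes straight to the stub.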
-
- Therefore, MuRedox requires:
-
- - At least a 68040 or a 68060. A 68030 or below implements all
- instructions anyhow and does not require emulator traps at all.
-
- - The mmu.library which is required to setup the special memory
- mapping.
-
- - The fpsp.resource. This resource contains the program code required
- for most unimplemented math functions. This resource is made
- available by the mmu.library-based 68040 and 68060.library.
-
- Therefore, installation of the "MuLib" libraries is *required*.
- Please see the "MMULib.lha" package on Aminet for how to install them.
-
- ------------------------------------------------------------------------------
-
- Top reasons why not to use this program:
-
- - It is a hack. MuRedox replaces program code on the fly, hoping that all
- will go well. This need not be the case - especially commercial
- programs may keep a checksum over their code and may fail if their
- code gets altered. MuRedox will perform such code modifications.
-
- - MuRedox will therefore not work for all programs - some
- incompatibilities should be expected.
-
- If you need faster programs, you should rather:
-
- - Ask the vendor for a 68060 or 68040 specific release of the program
- that does not require the software emulated instructions of the 68040
- or 68060. Typically, these versions will run faster than a
- 68020/68030 version with MuRedox, anyhow.
-
- - Remember: Programs are made fast by fast and smart algorithms, not by
- your favourite speedup patch. MuRedox will give some speed improvement,
- in realistic situations in the range of at most 10%. Specific
- benchmarks may show more dramatic improvements, but they typically test
- situations that are atypical of real-life use. Motorola
- chose less frequently used instructions for software emulation in
- the first place, hence improvements are typically marginal.
-
- ------------------------------------------------------------------------------
-
- Implementation specific:
-
- MuRedox is not the first program of this kind. Two other implementations
- exist (OxyPatcher and CyberPatcher). MuRedox tries to be better than the
- two, of course: (-:
-
- - It uses the fpsp.resource for the calculations. This means that the
- result of the calculation is *guaranteed* to be the same as what
- "the real thing" would have generated. The fpsp.resource is also used
- internally by the 68040 and 68060.library for handling the emulator
- traps. This means that even picky side conditions (such as the rounding
- mode and exception generation) are taken care of. If a program does
- *NOT* check its own code, then the behaviour of the system with MuRedox
- running is indistinguishable from that of a system running only the 040
- or 060 library.
- - It is mmu.library aware. This means specifically that it will respect
- "virtual memory" situations where the corresponding code is swapped
- out, or will not try to overwrite code that is write-protected. Hence,
- it will also avoid MuForce hits. It also cooperates nicely with other
- MuTools by its design.
- - MuRedox includes emulation routines for almost all instructions that
- would cause emulator traps.
- - MuRedox offers a special feature for instructions causing exceptions,
- for example a "divide by zero". In this case, MuRedox removes
- the emulation routine of the offending instruction and replaces it with
- the real instruction again. Programs checking their code especially
- in these situations will therefore never see the MuRedox patches,
- and the program counter of the faulting instruction will be "right
- in place". This trick also has the advantage of generating exactly
- the same instruction type and status as the emulation routines within
- the 68040/68060.library - simply because this emulation routine gets
- used!
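
The exception handling described in the last item can be sketched as follows. This is a hypothetical Python model, not the actual mechanism: PatchedProgram, slow_emulator and the tuple-based program are invented for illustration, and an integer division stands in for any emulated instruction that may raise an exception.

```python
def slow_emulator(opcode, a, b):
    """Stands in for the CPU library's trap handler (hypothetical)."""
    if b == 0:
        raise ZeroDivisionError("divide by zero at the original instruction")
    return a // b


class PatchedProgram:
    """Toy model: undo a patch when the patched instruction would fault."""

    def __init__(self, program):
        self.program = list(program)  # list of opcode tuples, e.g. ("divu",)
        self.saved = {}               # pc -> original instruction

    def patch(self, pc):
        self.saved[pc] = self.program[pc]
        self.program[pc] = ("stub", pc)

    def unpatch(self, pc):
        self.program[pc] = self.saved.pop(pc)

    def step(self, pc, a, b):
        instr = self.program[pc]
        if instr[0] == "stub":
            if b == 0:
                # Exceptional case: restore the real instruction and let the
                # regular emulator path raise the exception at that location.
                self.unpatch(pc)
                return self.step(pc, a, b)
            return a // b  # fast path through the replacement routine
        return slow_emulator(instr[0], a, b)


prog = PatchedProgram([("divu",)])
prog.patch(0)
prog.step(0, 7, 2)  # runs through the stub
```

After a faulting call such as `prog.step(0, 1, 0)`, the program in this model contains the original instruction again, so code that inspects itself in its exception handler sees nothing unusual.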
-
- ------------------------------------------------------------------------------
-
- SYNOPSIS:
-
- MuRedox EMULATORSIZE=EMUSIZ/N
-
- EMULATORSIZE: Size in K bytes of the output buffer for the
- just-in-time compiler. Emulated instruction sequences
- go into this buffer. If this buffer overruns,
- MuRedox will no longer be able to replace further
- emulated instructions.
-
- Defaults to 64K, which has proven to be more than
- sufficient for normal operation; typically not more
- than 16K are required.
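
The effect of a full buffer can be illustrated with a small model. The sketch below assumes a simple "bump" allocation scheme, which this document does not confirm; StubBuffer and the stub sizes are invented for illustration only.

```python
class StubBuffer:
    """Toy model of a fixed-size output buffer for generated stubs."""

    def __init__(self, size_kb=64):
        self.capacity = size_kb * 1024  # buffer size in bytes
        self.used = 0                   # bytes handed out so far

    def allocate(self, stub_size):
        """Return the offset of a new stub, or None if it no longer fits."""
        if stub_size <= 0:
            raise ValueError("stub_size must be positive")
        if self.used + stub_size > self.capacity:
            # Buffer exhausted: the instruction stays unpatched and keeps
            # going through the regular emulator trap.
            return None
        offset = self.used
        self.used += stub_size
        return offset


buffer = StubBuffer(size_kb=1)
first = buffer.allocate(1000)   # fits
second = buffer.allocate(100)   # does not fit any more, yields None
```

Nothing fails when the buffer is full in this model; instructions that could not be replaced simply continue to run at emulator-trap speed.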
-
- NOTE: MuRedox cannot be removed once it is running. This is intentional as
- MuRedox has no control over which programs got altered and whether these
- alterations are still active.
-
- ------------------------------------------------------------------------------
-
- The THOR-Software Licence (v2, 24th June 1998)
-
-
- This License applies to the computer programs known as "mmu.library",
- "MuRedox", "FPSPSnoop" and the corresponding documentation, known as
- ".readme" files. The "Program", below, refers to such program. The "Archive"
- refers to the package of distribution, as prepared by the author of the
- Program, Thomas Richter. Each licensee is addressed as "you".
-
-
-
- The Program and the data in the archive are freely distributable
- under the restrictions stated below, but are also Copyright (c)
- Thomas Richter.
-
- Distribution of the Program, the Archive and the data in the Archive by a
- commercial organization without written permission from the author to any
- third party is prohibited if any payment is made in connection with such
- distribution, whether directly (as in payment for a copy of the Program) or
- indirectly (as in payment for some service related to the Program, or
- payment for some product or service that includes a copy of the Program
- "without charge"; these are only examples, and not an exhaustive enumeration
- of prohibited activities).
-
-
- However, the following methods of distribution
- involving payment shall not in and of themselves be a violation of this
- restriction:
-
-
- (i) Posting the Program on a public access information storage and
- retrieval service for which a fee is received for retrieving information
- (such as an on-line service), provided that the fee is not
- content-dependent (i.e., the fee would be the same for retrieving the same
- volume of information consisting of random data).
-
-
- (ii) Distributing the Program on a CD-ROM, provided that
-
- a) the Archive is reproduced entirely and verbatim on such CD-ROM, including
- especially this licence agreement;
-
- b) the CD-ROM is made available to the public for a nominal fee only,
-
- c) a copy of the CD is made available to the author for free except for
- shipment costs, and
-
- d) provided further that all information on such CD-ROM is re-distributable
- for non-commercial purposes without charge.
-
-
- Redistribution of a modified version of the Archive, the Program or the
- contents of the Archive is prohibited in any way, by any organization,
- regardless whether commercial or non-commercial. Everything must be kept
- together, in original and unmodified form.
-
-
-
-
- Limitations.
-
-
- THE PROGRAM IS PROVIDED TO YOU "AS IS", WITHOUT WARRANTY. THERE IS NO
- WARRANTY FOR THE PROGRAM, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT
- LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
- PARTICULAR PURPOSE AND NON-INFRINGEMENT OF THIRD PARTY RIGHTS. THE ENTIRE
- RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD
- THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY
- SERVICING, REPAIR OR CORRECTION.
-
-
- IF YOU DO NOT ACCEPT THIS LICENCE, YOU MUST DELETE THE PROGRAM, THE ARCHIVE
- AND ALL DATA OF THIS ARCHIVE FROM YOUR STORAGE SYSTEM. YOU ACCEPT THIS
- LICENCE BY USING OR REDISTRIBUTING THE PROGRAM.
-
-
- Thomas Richter
-
- ______________________________________________________________________________
-
- Installation:
-
- MuRedox requires the 68040 and 68060.libraries to be installed. Please check
- the documentation of the appropriate archives on Aminet and install the
- latest versions of the two libraries found in this archive. They fix a minor
- bug in the fpsp.resource which is irrelevant for normal operation.
-
- MuRedox should then be run from the shell without arguments.
-
- *WARNING* Once MuRedox is run, you can't quit it. This is because it is
- not clear whether any of the replacement stub routines are still in use
- when the program would try to exit.
-
- ______________________________________________________________________________
-
- In case of trouble:
-
- ...or if you think that MuRedox does not speed up your favourite program.
-
-
- MuRedox is a pretty tricky program, and this beta release may
- contain some bugs, so testing is mandatory. The same goes for situations
- where you think that MuRedox "misses" some instructions that might slow
- down your favourite application. For tracing down these problems, you
- need the "FPSPSnoop" program and the "disassembler.library" within this
- archive, and possibly a second computer or "Sashimi". The latter can be
- found in the Aminet in a separate archive and is not included.
-
- Usage of FPSPSnoop is pretty simple: Run it from the shell without
- arguments. If you do not own a second computer and a null-modem cable,
- you need to run Sashimi as well. FPSPSnoop outputs will then go into the
- Sashimi window.
-
- The order in which the programs are run determines what is "snooped"
- by FPSPSnoop:
-
- - Run FPSPSnoop (and Sashimi) first, MuRedox later: Then FPSPSnoop will only
- snoop instructions that are not handled by MuRedox. This might be useful to
- detect cases where MuRedox misses some instructions.
-
- - Run MuRedox first, FPSPSnoop (and Sashimi) later: This will snoop *all*
- instructions that will be emulated before they are replaced by MuRedox. This
- is useful when you suspect that one of the emulation routines of MuRedox
- is broken.
-
- In either case, I would need the output of FPSPSnoop to fix any problem!
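
Why the start order matters can be pictured with a chain of trap handlers. The sketch below rests on an assumption inferred from the behaviour described above, namely that the most recently started program sees a trap first; TrapChain, make_snoop and make_redox are hypothetical names, not part of either program.

```python
class TrapChain:
    """Toy handler chain: the most recently installed handler runs first."""

    def __init__(self):
        self.handlers = []

    def install(self, handler):
        self.handlers.insert(0, handler)

    def trap(self, instruction):
        for handler in self.handlers:
            if handler(instruction):
                return  # handled; handlers further down never see it


def make_snoop(log):
    def snoop(instruction):
        log.append(instruction)  # only observes, never handles
        return False
    return snoop


def make_redox(supported):
    def redox(instruction):
        return instruction in supported  # True means "replaced by a stub"
    return redox


def snooped(order):
    """Return what the snooper logs for a given start order."""
    log = []
    programs = {"snoop": make_snoop(log), "redox": make_redox({"fsin"})}
    chain = TrapChain()
    for name in order:
        chain.install(programs[name])
    for instruction in ("fsin", "ftrapcc"):
        chain.trap(instruction)
    return log


missed_only = snooped(["snoop", "redox"])  # snooper started first
everything = snooped(["redox", "snoop"])   # snooper started last
```

With the snooper started first, only the instruction the patcher does not handle reaches it; with the snooper started last, it logs every trapping instruction before the patcher gets to replace it.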
- ______________________________________________________________________________
-
- Emulated instructions:
-
- Thanks to its "just in time compiler", MuRedox is able to replace almost
- all instructions that would cause emulator traps. MuRedox does not come
- with a fixed set of stub-routines that are just patched in, it *creates*
- these routines as soon as the problem is detected. Therefore, it is maybe
- easier to say what MuRedox does not emulate rather than what it does...
-
- Integer instructions that get replaced:
-
- mulu 64 bit variant, all addressing modes, 68060 only, native on 68040
- muls 64 bit variant, all addressing modes, 68060 only, native on 68040
- divu 64 bit variant, all addressing modes, 68060 only, native on 68040
- divs 64 bit variant, all addressing modes, 68060 only, native on 68040
- movep all addressing modes, 68060 only, native on 68040
- cmp2 all addressing modes, 68060 only, native on 68040
-
- Integer instructions that are not patched:
-
- chk2 This instruction is especially useful for debugging as it will
- cause a "trap" if a condition to be checked is not fulfilled.
- Replacing this instruction may confuse an active debugger.
- Besides, I haven't seen any program that makes use of it.
- cas (with odd addresses only) Not usable on Amiga hardware due to
- the unsupported read-modify-write cycle that might break DMA
- access.
- cas2 Not usable on Amiga hardware due to the unsupported
- read-modify-write cycle.
-
- Integer addressing modes:
-
- All supported except (a7)+ as target and -(a7) as source. Both addressing
- modes break the logic of a stack and won't work in a multitasking
- environment either. The 68040 and 68060 emulator core doesn't support them either.
-
- FPU general instructions:
-
- The following are handled by forwarding the emulation to the fpsp.resource:
-
- facos all addressing modes
- fasin all addressing modes
- fatan all addressing modes
- fatanh all addressing modes
- fcos all addressing modes
- fcosh all addressing modes
- fetox all addressing modes
- fetoxm1 all addressing modes
- fgetexp all addressing modes
- fgetman all addressing modes
- fint all addressing modes, 68040 only, native on 68060
- fintrz all addressing modes, 68040 only, native on 68060
- flog10 all addressing modes
- flog2 all addressing modes
- flogn all addressing modes
- flognp1 all addressing modes
- fmod all addressing modes
- frem all addressing modes
- fscale all addressing modes
- fsin all addressing modes
- fsincos all addressing modes
- fsinh all addressing modes
- ftan all addressing modes
- ftanh all addressing modes
- ftentox all addressing modes
- ftwotox all addressing modes
-
- The following are handled with special emulation routines within MuRedox:
-
- fmovecr all constants, with special versions for 0,1 and 10.
- fdbcc all addressing modes, 68060 only, native on 68040
- fscc all addressing modes, 68060 only, native on 68040
-
- Supported floating point addressing modes:
-
- All including #immediate.x(extended precision) for which emulation routines
- are generated as well for the 68060, and excluding "packed decimal".
-
- Instructions that are not patched over:
-
- ftrapcc 68060 only, native on 68040
- Mainly useful for debugging, this instruction causes an
- exception if a special FPU condition is met. A debugger
- might depend on this specific instruction, therefore this
- is left alone.
-
- fmovem.x with dynamic register list (in CPU registers), 68060 only,
- native on 68040.
- fmovem.l multiple control registers with immediate operand and more
- than one control register, 68060 only, native on 68040.
-
- I haven't seen any program using these instructions. If you
- need them emulated, prove to me that they get used by running
- FPSPSnoop.
-
- Addressing modes that are not replaced by emulation code:
-
- #immed.p (packed decimal) and any other packed decimal operation.
-
- Since these instructions are slow even on a 68881/82 FPU,
- they are hardly ever used in speed-critical situations. If
- you really need them emulated, please tell me the program
- that uses them.
- ______________________________________________________________________________
-
- So long,
-
- Thomas Richter (August 2001)
-
-
- ============================= Archive contents =============================
-
- Original Packed Ratio Date Time Name
- -------- ------- ----- --------- -------- -------------
- 5976 3267 45.3% 19-Aug-01 17:15:10 +FPSPSnoop
- 43632 23491 46.1% 19-Aug-01 17:14:34 +68040.library
- 64804 30392 53.1% 19-Aug-01 17:14:38 +68060.library
- 17240 10053 41.6% 19-Aug-01 17:14:44 +disassembler.library
- 43976 23311 46.9% 19-Aug-01 17:14:26 +mmu.library
- 9548 5689 40.4% 26-Aug-01 13:36:28 +MuRedox
- 22975 8304 63.8% 26-Aug-01 13:38:12 +MuRedox.guide
- 523 273 47.8% 26-Aug-01 13:38:28 +MuRedox.guide.info
- 1083 553 48.9% 26-Aug-01 13:38:28 +MuRedox.info
- 14829 5602 62.2% 25-Aug-01 14:22:30 +MuRedox.readme
- -------- ------- ----- --------- --------
- 224586 110935 50.6% 28-Aug-01 02:45:54 10 files
-